Increasing the Classification Accuracy of Simple Bayesian Classifier

Authors

  • Sotiris B. Kotsiantis
  • Panayiotis E. Pintelas
Abstract

The simple Bayes algorithm assumes that every feature is independent of the other features, given the state of the class feature. Because this independence assumption is almost always wrong, researchers knowledgeable about theoretical issues have generally rejected the crude independence model in favor of more complicated alternatives. In this study, we attempted to increase the prediction accuracy of the simple Bayes model. Because combining classifiers has been proposed as a new direction for improving the performance of individual classifiers, we used AdaBoost, with the difference that in each iteration we applied a discretization method and removed redundant features using a filter feature selection method. Finally, we performed a large-scale comparison on 26 standard benchmark datasets against other attempts to improve the accuracy of the simple Bayes algorithm, as well as other state-of-the-art algorithms and ensembles. Our approach achieved better accuracy in most cases and also required less training time.
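
The abstract does not say which discretization method, which filter, or which AdaBoost variant was used, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes equal-width binning, a mutual-information filter that keeps the top `keep` features, and boosting by resampling. All function names and parameters (`fit_round`, `boost`, `bins`, `keep`, `seed`) are hypothetical.

```python
import math
import random
from collections import Counter

def fit_bins(col, bins):
    """Equal-width cut parameters (lo, width) for one numeric column (assumed method)."""
    lo, hi = min(col), max(col)
    return lo, ((hi - lo) / bins) or 1.0

def to_bin(v, lo, width, bins):
    """Map a value to a bin index, clamping values outside the fitted range."""
    return max(0, min(bins - 1, int((v - lo) / width)))

def mutual_info(xs, ys):
    """Mutual information between a discrete feature and the class labels."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log(c * n / (px[x] * py[t])) for (x, t), c in pxy.items())

def fit_round(X, y, bins, keep):
    """One boosting round: discretize, filter features, fit a simple Bayes model."""
    cuts = [fit_bins([row[j] for row in X], bins) for j in range(len(X[0]))]
    D = [[to_bin(v, *cuts[j], bins) for j, v in enumerate(row)] for row in X]
    ranked = sorted(range(len(cuts)), key=lambda j: -mutual_info([r[j] for r in D], y))
    feats = ranked[:keep]                      # filter: keep the top-ranked features
    prior = Counter(y)                         # class counts
    cond = Counter((j, r[j], c) for r, c in zip(D, y) for j in feats)
    return cuts, feats, prior, cond, bins, len(y)

def predict_one(model, row):
    """Most probable class under the independence assumption."""
    cuts, feats, prior, cond, bins, n = model
    best, best_lp = None, -math.inf
    for c, nc in prior.items():
        lp = math.log(nc / n)
        for j in feats:
            b = to_bin(row[j], *cuts[j], bins)
            lp += math.log((cond[(j, b, c)] + 1) / (nc + bins))  # Laplace smoothing
        if lp > best_lp:
            best, best_lp = c, lp
    return best

def boost(X, y, rounds=10, bins=3, keep=1, seed=0):
    """AdaBoost by resampling; preprocessing is refitted inside every round."""
    rng = random.Random(seed)
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        idx = rng.choices(range(n), weights=w, k=n)
        model = fit_round([X[i] for i in idx], [y[i] for i in idx], bins, keep)
        wrong = [predict_one(model, X[i]) != y[i] for i in range(n)]
        err = sum(wi for wi, bad in zip(w, wrong) if bad)
        if err >= 0.5:                         # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err + 1e-10) / (err + 1e-10))
        ensemble.append((alpha, model))
        if err == 0:                           # perfect round: nothing left to reweight
            break
        w = [wi * math.exp(alpha if bad else -alpha) for wi, bad in zip(w, wrong)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted vote of the per-round models."""
    votes = Counter()
    for alpha, model in ensemble:
        votes[predict_one(model, row)] += alpha
    return votes.most_common(1)[0][0]

# Toy data (made up for illustration): feature 0 separates the classes, feature 1 is noise.
X = [[i * 0.05, i % 2] for i in range(10)] + [[2.55 + i * 0.05, i % 2] for i in range(10)]
y = [0] * 10 + [1] * 10
ensemble = boost(X, y)
train_accuracy = sum(predict(ensemble, r) == t for r, t in zip(X, y)) / len(X)
```

Refitting the bins and the feature filter on each resampled training set means every ensemble member can see a different discretization and feature subset, which is one plausible reading of "in each iteration" in the abstract.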


Similar resources

Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents

Text classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents to predefined categories based on the documents' contents and labeled training samples. Since word detection is a difficult and time-consuming task in the Persian language, a Bayesian text classifier is an appropriate approach to deal with different...



Improving the Quality of Text Classification Using a Two-Level Classifier Committee

Nowadays, automated text classification has gained special importance due to the increasing availability of documents in digital form and the ensuing need to organize them. Although this problem belongs to the Information Retrieval (IR) field, the dominant approach is based on machine learning techniques. Approaches based on classifier committees have shown better performance than the others. I...


Feature Selection for Ensembles of Simple Bayesian Classifiers

A popular method for creating an accurate classifier from a set of training data is to train several classifiers, and then to combine their predictions. The ensembles of simple Bayesian classifiers have traditionally not been a focus of research. However, the simple Bayesian classifier has much broader applicability than previously thought. Besides its high classification accuracy, it also has ...


Logitboost of Multinomial Bayesian Classifier for Text Classification

Automated text classification has been considered as a vital method to manage and process a vast amount of documents in digital forms that are widespread and continuously increasing. In general, text classification plays an important role in information extraction and summarization, text retrieval, and question-answering. The Multinomial Bayesian Classifier has traditionally been a focus of res...


Exploring Case-Based Bayesian Networks and Bayesian Multi-nets for Classification

Recent work in Bayesian classifiers has shown that a better and more flexible representation of domain knowledge results in better classification accuracy. In previous work [1], we have introduced a new type of Bayesian classifier called Case-Based Bayesian Network (CBBN) classifiers. We have shown that CBBNs can capture finer levels of semantics than possible in traditional Bayesian Networks (...




Publication date: 2004